ABSTRACT

We construct a catalogue for filaments using a novel approach called SCMS (subspace 
constrained mean shift; Ozertem & Erdogmus 2011; Chen et al. 2015). SCMS is a 
gradient-based method that detects filaments through density ridges (smooth curves 
tracing high-density regions). A great advantage of SCMS is its uncertainty measure, 
which allows an evaluation of the errors for the detected filaments. To detect filaments, 
we use data from the Sloan Digital Sky Survey, which consist of three galaxy samples: 
the NYU main galaxy sample (MGS), the LOWZ sample and the CMASS sample. 
Each of the three datasets covers a different redshift range, so the combined sample
allows detection of filaments up to z = 0.7.

Our filament catalogue consists of a sequence of two-dimensional filament maps
at different redshifts that provide several useful statistics on the evolution of the
cosmic web. To construct the maps, we select spectroscopically confirmed galaxies within
0.050 < z < 0.700 and partition them into 130 bins. For each bin, we ignore the
redshift, treat the galaxy observations as 2-D data and detect filaments using
SCMS. The filament catalogue consists of 130 individual 2-D filament maps, and
each map comprises points on the detected filaments that describe the filamentary
structures at a particular redshift.

We also apply our filament catalogue to investigate galaxy luminosity and its
relation with distance to filament. Using a volume-limited sample, we find strong
evidence (6.1σ - 12.3σ) that galaxies close to filaments are generally brighter than
those at significant distance from filaments.

Key words: (cosmology:) large-scale structure of Universe - catalogues 


1 INTRODUCTION 

Matter in the Universe tends to be distributed in a network-like
large-scale structure which is known as the cosmic web
(Bond et al. 1996). The existence of this filamentary structure
has been confirmed observationally and can be reproduced
in N-body simulations (Jenkins et al. 1998; Colberg
et al. 2005; Springel et al. 2005; Dolag et al. 2006). The
large-scale structure comprises four distinct objects: overdense
clusters, interconnected filaments, widespread sheet-like
walls and large empty voids. In this paper, we focus on
cosmic filaments.

* E-mail: yenchic@andrew.cmu.edu 


The principal approach to studying large-scale structure is
to construct a catalogue. For galaxy clusters, several catalogues
have been created; see, e.g., the Abell catalogue (Abell
et al. 1989), redMaPPer (Rykoff et al. 2014; Rozo & Rykoff
2014), XCS (Menanteau et al. 2013), MCMC (Piffaretti et al.
2011), Mantz (Mantz et al. 2010), and Planck ESZ (Planck
Collaboration et al. 2011). However, there exist few catalogues
for filaments (a recent example can be found in Tempel
et al. 2014). There are three reasons why high-quality
filament catalogues are needed.

First, catalogues for filaments provide a reference to
other types of large-scale structures. It is known that galaxy
clusters are connected by filaments; filaments are mainly distributed
on cosmic sheets/walls and surround large empty
voids. With a catalogue of filaments, it is easier to identify
other large-scale structures.

Second, filamentary structure at different redshifts can
be used to probe cosmological models. In N-body simulations,
we observe how a small fluctuation in the initial density
field, magnified by gravitational force over time, weaves
matter into a web-like structure (Jenkins et al. 1998; Colberg
et al. 2005; Springel et al. 2005; Dolag et al. 2006). The
evolution of the cosmic web depends on the initial condition
of the early Universe. Thus, how filaments change as a function
of redshift conveys information about dark matter and
dark energy.

Finally, filament catalogues make it easier to study
properties of filaments and their interaction with nearby
galaxies. For example, recent simulations have shown that
galaxy intrinsic alignments and luminosity are impacted by
nearby filaments (Codis et al. 2015). Orientations of filaments
are also found to be correlated with the shape, angular
momentum and peculiar velocity of dark matter haloes
(Hahn et al. 2007b,a, 2009; Paz et al. 2008; Zhang et al. 2009,
2013; Forero-Romero et al. 2014). Despite the abundance
of results from simulation studies, few measurements regarding
galaxies aligned along filaments have been obtained (see
Jones et al. 2010 and Guo et al. 2015 for some results on
spin alignment and luminosity).

In this paper, we present a catalogue for filaments using
a novel approach called SCMS (subspace constrained mean
shift, Ozertem & Erdogmus 2011) that models filaments as
ridges of the galaxy probability density function. SCMS first
estimates galaxy density fields, then uses a gradient ascent
method to detect ridges; ridges are curve-like, smooth structures
that characterize high-density regions. SCMS has two
appealing properties. First, SCMS consistently detects filaments
in the sense that the intersections between filaments
generally are populated by galaxy clusters (see Section 5.2
and Chen et al. 2015), as confirmed by other galaxy cluster
detections (Rykoff et al. 2014; Rozo & Rykoff 2014). Second,
SCMS is equipped with a statistically consistent measure of
uncertainty (Chen et al. 2014a) that allows an evaluation of
the error for filament detection.

To construct the filament maps, we use a combined
galaxy sample from the Sloan Digital Sky Survey (SDSS; York
et al. 2000; Eisenstein et al. 2011) that consists of three
datasets: the NYU MGS, LOWZ and CMASS samples. We focus
on redshifts in the range 0.050 < z < 0.700 since the observed
number density within this range is sufficiently high
to generate statistically meaningful results. We first partition
the Universe according to redshift into thin slices of
width $\Delta z = 0.005$, then perform SCMS within each slice.
This process yields a series of filament maps that characterize
the filamentary structure of the Universe at different
redshifts. The variation of filament maps at different
redshifts provides information about the evolution of the
Universe that can be further used to constrain cosmology.
We can construct filament maps for future photometric surveys
(e.g. LSST) or spectroscopic surveys (e.g. WFIRST and
Euclid) by applying SCMS to these data.

To demonstrate the usefulness of our filament maps,
we investigate the relationship between the luminosity of a
galaxy and its distance to nearby filaments. We separate
the three samples (NYU MGS, LOWZ and CMASS) and
compare galaxy luminosity versus its distance to filaments.
There is strong evidence that galaxies near filaments tend
to be brighter than those away from filaments.

In this paper, we assume a WMAP7 $\Lambda$CDM cosmology
with $H_0 = 70$, $\Omega_m = 0.274$, and $\Omega_\Lambda = 0.726$ (Anderson et al.
2012, 2014).


2 MODELS AND METHODS 

2.1 SCMS: Detecting Filaments through Density 
Ridges 


We adopt the Subspace Constrained Mean Shift (SCMS; Ozertem
& Erdogmus 2011) algorithm to construct filament
maps for slices of our Universe at various redshifts. We use
the version of SCMS described in Chen et al. (2015). SCMS
detects filaments as galaxy density ridges (Chen et al. 2014a)
using a three-step algorithm: density estimation, thresholding
and gradient ascent. Detailed implementations
of SCMS can be found in Ozertem & Erdogmus
(2011) and Chen et al. (2015). Here we briefly discuss
how density ridges are detected by SCMS.

Let $X_1, \cdots, X_n$ denote the locations of galaxies and

$$p(x) = \frac{1}{nh^d} \sum_{i=1}^{n} K\left(\frac{\|x - X_i\|}{h}\right) \qquad (1)$$

be the kernel density estimator (KDE; Wasserman 2006), where h is the
smoothing bandwidth that controls the degree of smoothing, d is the
dimension of the data (d = 2 within a slice), K is the
Gaussian kernel, and $\|x - y\|$ is the Euclidean distance between
x and y. We further define $g(x) = \nabla p(x)$ and $H(x) = \nabla\nabla p(x)$
to be the gradient and the Hessian matrix of $p(x)$, respectively.

The density ridges (Eberly 1996; Ozertem & Erdogmus
2011; Genovese et al. 2014; Chen et al. 2014b,a) of p are the
collection of points

$$R = \{x : v_j(x)^T g(x) = 0,\ j = 2, 3,\ \lambda_2(x) < 0\}, \qquad (2)$$

where $v_j(x)$ and $\lambda_j(x)$ are the j-th eigenvector and eigenvalue,
respectively, of $H(x)$, and $p_0$ is the density threshold used in the
thresholding step. Essentially, SCMS outputs a set of points on R. See Figure 1 for
an example.
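
The ridge-finding step can be illustrated with a minimal sketch. It assumes a Gaussian kernel and uses the Hessian of the density itself to choose the constrained subspace; the function names (`kde`, `scms_step`, `scms`) and the toy data are hypothetical and are not taken from the implementations of Ozertem & Erdogmus (2011) or Chen et al. (2015). The density thresholding step and the check $\lambda_2(x) < 0$ are omitted for brevity.

```python
import numpy as np

def kde(x, data, h):
    """Gaussian kernel density estimate at a single point x (cf. equation 1)."""
    n, d = data.shape
    sq = np.sum((data - x) ** 2, axis=1)
    norm = n * (np.sqrt(2.0 * np.pi) * h) ** d
    return np.exp(-0.5 * sq / h**2).sum() / norm

def scms_step(x, data, h):
    """One subspace constrained mean shift update of the point x."""
    d = data.shape[1]
    diff = data - x                                    # X_i - x, shape (n, d)
    w = np.exp(-0.5 * np.sum(diff**2, axis=1) / h**2)  # Gaussian weights
    shift = (w[:, None] * diff).sum(axis=0) / w.sum()  # mean-shift vector
    # Hessian of the KDE, up to a positive constant factor.
    hess = (np.einsum("i,ij,ik->jk", w, diff, diff) / h**4
            - w.sum() * np.eye(d) / h**2)
    _, evecs = np.linalg.eigh(hess)                    # eigenvalues ascending
    V = evecs[:, : d - 1]                              # v_2, ..., v_d
    return x + V @ (V.T @ shift)                       # move only across the ridge

def scms(x0, data, h, tol=1e-8, max_iter=500):
    """Iterate scms_step from x0 until the update is smaller than tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = scms_step(x, data, h)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Toy data: points lying exactly on the line y = 0, so the ridge is y = 0.
line = np.column_stack([np.linspace(-1.0, 1.0, 201), np.zeros(201)])
ridge_point = scms([0.05, 0.10], line, h=0.2)
```

In this toy case the starting point is pulled perpendicular to the line until it sits on it, while its position along the line is essentially unchanged; this is the behaviour that distinguishes SCMS from ordinary mean shift, which would drift toward a density mode.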

The idea of using the eigen-structure of the Hessian matrix
of the galaxy density to detect filaments has been used in
other filament finders; see, e.g., the skeleton (Novikov et al.
2006), the Multiscale Morphology Filter (MMF; Aragon-Calvo
et al. 2007, 2010a), the Smoothed Hessian Major
Axis Filament Finder (SHMAFF; Bond et al. 2010b,a), the
Spine method (Aragon-Calvo et al. 2010b), and the DisPerSE
model (Sousbie 2011).


2.2 SCMS: Uncertainty Measure 

An appealing property of SCMS is its uncertainty measure.
We measure the error for filament detection via the bootstrap
technique (Efron 1979). We bootstrap the original data
and apply SCMS to the bootstrap sample. This yields
a set of bootstrap filaments. We then compute
the (projection) distances from R (a filament detected in
the original sample) to each bootstrap filament. Thus, each
point on R is assigned an error value. By repeating the
bootstrap multiple times, for example 1,000 times, every
point on R has 1,000 error values. The mean of these bootstrap
error values for each point is the uncertainty measure
of the filaments detected by SCMS. A more detailed discussion
of the uncertainty measure can be found in Chen et al.
(2015).

Figure 1. An illustration of the SCMS technique. (a): The original data. (b): Kernel density estimation (red-yellow: high density regions)
and thresholding (removing points in low density regions). (c): The ridge estimation (blue curve). Essentially, the SCMS technique
identifies ridges in the galaxy density function estimated by the KDE; see §2.1 for a more detailed discussion.
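
The bootstrap procedure of Section 2.2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `find_filaments` is a hypothetical callable standing in for SCMS, `toy_finder` is a deliberately trivial stand-in used only to make the example runnable, and the projection distance is approximated by the distance to the nearest bootstrap filament point.

```python
import numpy as np

def bootstrap_uncertainty(data, ridge, find_filaments, n_boot=100, seed=0):
    """Mean bootstrap distance from each ridge point to the resampled filaments."""
    rng = np.random.default_rng(seed)
    n = len(data)
    errors = np.empty((n_boot, len(ridge)))
    for b in range(n_boot):
        resample = data[rng.integers(0, n, size=n)]   # sample with replacement
        boot = find_filaments(resample)               # bootstrap filament points
        # Distance from every original ridge point to its nearest bootstrap point.
        gaps = np.linalg.norm(ridge[:, None, :] - boot[None, :, :], axis=2)
        errors[b] = gaps.min(axis=1)
    return errors.mean(axis=0)                        # one error value per ridge point

def toy_finder(points):
    """Hypothetical stand-in for SCMS: a horizontal line at the mean of y."""
    xs = np.linspace(-1.0, 1.0, 21)
    return np.column_stack([xs, np.full_like(xs, points[:, 1].mean())])

rng = np.random.default_rng(1)
galaxies = np.column_stack([rng.uniform(-1, 1, 400), rng.normal(0.0, 0.1, 400)])
uncertainty = bootstrap_uncertainty(galaxies, toy_finder(galaxies), toy_finder)
```

Replacing `toy_finder` with a real ridge finder gives a per-point error value in the same units as the input coordinates.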


2.3 Slicing the Universe 

The observed galaxy locations contain three variables,
namely, the right ascension $\alpha_{2000}$, the declination $\delta_{2000}$, and
the redshift z. We partition the range of redshift into several
small intervals; this procedure slices the Universe. We
apply SCMS to galaxies within each slice to detect filaments.

We slice the Universe for three purposes. First, this action
removes the Finger-of-God effect since galaxies in the
same slice share nearly the same redshift. The Finger-of-God
effect is produced by the small peculiar velocities of galaxies,
so galaxy clusters are stretched out along the line of sight in
redshift space. Thus, most filaments appear to point
toward the Earth although they may not really stretch along
the line of sight.

Second, slicing the Universe in redshift reduces the computational
cost dramatically. There are two barriers to the
computational efficiency of SCMS. One is the number of
initial points. In the third step of SCMS (filament detection),
we must select many initial grid points to perform an
ascending process. This ascending process, called subspace
constrained mean shift, pushes points until they arrive at
ridges. The size of the Universe greatly increases as the redshift
increases, so we need many grid points to detect filaments.
The other barrier is that SCMS
requires evaluation of the Hessian of the density function, which
is known to be computationally intensive if the number of
points is large or the dimension is high. Taking slices of the
Universe reduces the dimension to two, and for each slice the
sample size (the number of galaxies) is small so that SCMS
can be performed within reasonable time.

Third, slicing the Universe according to the redshift allows
a comparison of filamentary structures at different redshifts.
Characteristics of filaments at different redshifts reveal
information about the nature of our Universe that can
be used to constrain cosmological parameters.


3 THE SDSS DATA 

We use a combined SDSS dataset that contains the main galaxy
sample (MGS) from DR7 and the LOWZ and CMASS samples
from DR12 (Alam et al. 2015). We describe the datasets in
detail in the following sections.


3.1 The NYU MGS Catalogue 

The SDSS DR7 (Abazajian et al. 2009) contains the completed
data set of SDSS-I and SDSS-II. These surveys obtained
wide-field CCD photometry (Gunn et al. 1998, 2006)
in five passbands (u, g, r, i, z; Fukugita et al. 1996; Doi et al.
2010), internally calibrated using the ‘uber-calibration’ process
described in Padmanabhan et al. (2008), amassing a total
footprint of 11,663 deg^2. From this imaging data, galaxies
within a footprint of 9380 deg^2 (Abazajian et al. 2009) were
selected for spectroscopic observation as part of the main
galaxy sample (Strauss et al. 2002), which, to good approximation,
consists of all galaxies with $r_{pet} < 17.77$, where $r_{pet}$
is the extinction-corrected r-band Petrosian magnitude. In
this analysis we do not consider the Luminous Red Galaxy
extension of this program to higher redshift (Eisenstein et al.
2001).

We obtain the SDSS DR7 Main Galaxy Sample from the
NYU value-added catalog (NYU VAGC, Blanton et al. 2005;
Padmanabhan et al. 2008; Adelman-McCarthy et al. 2008).
It includes K-corrected absolute magnitudes and detailed
information on the mask. This sample uses galaxies with
$14.5 < r_{pet} < 17.6$. The $r_{pet} > 14.5$ limit ensures that only
galaxies with reliable SDSS photometry are used and the
$r_{pet} < 17.6$ limit allows a homogeneous selection over the full
footprint of 6141 deg^2 (Blanton et al. 2005). Galaxies that
did not obtain a redshift due to fibre collisions are assigned
the redshift of their nearest neighbour.
3.2 The LOWZ and CMASS Catalogues

The LOWZ and CMASS samples are from Data Release
12 (Alam et al. 2015) of the Sloan Digital Sky Survey
(SDSS). Together, SDSS I, II, and III imaged over one third
of the sky (14,555 deg^2) in the u, g, r, i, z photometric bandpasses
to a limiting magnitude of r ~ 22.5. The imaging
data were processed through a series of pipelines that perform
astrometric calibration (Pier et al. 2003), photometric
reduction (Lupton et al. 2001), and photometric calibration
(Padmanabhan et al. 2008). All of the imaging was reprocessed
as part of SDSS Data Release 8 (Aihara et al. 2011).

The Baryon Oscillation Spectroscopic Survey (BOSS)
of SDSS-III has obtained spectra and redshifts for 1.35 million
galaxies over a footprint covering 10,000 square degrees.
These galaxies are selected from the SDSS imaging (Aihara
et al. 2011) and were observed together with 160,000 quasars
and approximately 100,000 ancillary targets. The targets are
assigned to tiles of diameter 3° using an algorithm
that adapts to the density of targets on
the sky (Blanton et al. 2003). Spectra are obtained using
the double-armed BOSS spectrographs (Smee et al. 2013).
Each observation is performed in a series of 900-second exposures,
integrating until a minimum signal-to-noise ratio is
achieved for the faint galaxy targets. This approach ensures
a homogeneous data set with a high redshift completeness
of more than 97% over the full survey footprint. Redshifts
are extracted from the spectra using the methods described
in Bolton et al. (2012). A summary of the survey design appears
in Eisenstein et al. (2011), and a full description of BOSS
is provided in Dawson et al. (2013).

BOSS selects two classes of galaxies to be targeted for
spectroscopy: ‘LOWZ’ and ‘CMASS’ (we refer the reader to
Anderson et al. 2014 for further description of these classes).
For the LOWZ sample, the effective redshift is $z_{eff} = 0.32$,
slightly lower than that of the SDSS-II luminous red galaxies
(LRGs) as we place a redshift cut z < 0.43. The CMASS selection
yields a sample with a median redshift z = 0.57 and
a stellar mass that peaks at $\log_{10}(M/M_\odot) = 11.3$ (Maraston
et al. 2013). Most CMASS targets are central galaxies
residing in dark matter haloes of mass ~ $10^{13}\,h^{-1}\,M_\odot$.


4 FILAMENT MAPS 

4.1 Construction of Filament Maps 

We construct filament maps using the three galaxy catalogues:
NYU MGS, LOWZ and CMASS. Figure 2 presents
some examples of constructed filaments (blue) with galaxies
(black) and galaxy clusters (red) from the redMaPPer
catalogue. Our construction of filament maps consists of the
following steps:

1. Slice the sample between 0.050 < z < 0.700 into 130 slices
of width $\Delta z = 0.005$.

2. Within each slice, select galaxies within

$150° < \alpha_{2000} < 200°$, $5° < \delta_{2000} < 30°$

since this is a relatively complete region for all three galaxy
catalogues.

3. Using KDE, compute the mean density and the root mean
square (RMS) density for the selected galaxies.

4. Using the RMS density as a threshold level in
SCMS, construct filament maps. The threshold stabilizes
the algorithm.

5. Apply masks of galaxy catalogues to eliminate filaments
outside the region of observations.

6. At each point on filaments, compute the filament’s local
direction.

[Figure 3: horizontal axis: Galaxy Probability Density]

Figure 3. Distribution of density profile of filaments at different
redshifts. The distribution of density profile is right-skewed at
low redshift, indicating that the galaxy density at filament points
is in general higher than at filament points at high redshift.
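
Steps 1 and 2 of the construction above can be sketched as follows. The function and variable names are hypothetical; only the numerical limits (0.050 < z < 0.700, 130 slices of width 0.005, and the sky window in right ascension and declination) come from the text.

```python
import numpy as np

Z_MIN, Z_MAX, N_SLICES = 0.050, 0.700, 130
edges = np.linspace(Z_MIN, Z_MAX, N_SLICES + 1)   # slice boundaries, width 0.005

def slice_index(z):
    """Index of the redshift slice containing z, or -1 outside [Z_MIN, Z_MAX)."""
    z = np.asarray(z, dtype=float)
    idx = np.searchsorted(edges, z, side="right") - 1
    return np.where((z >= Z_MIN) & (z < Z_MAX), idx, -1)

def in_sky_window(ra, dec):
    """Sky region used in step 2 (degrees)."""
    ra, dec = np.asarray(ra), np.asarray(dec)
    return (ra > 150.0) & (ra < 200.0) & (dec > 5.0) & (dec < 30.0)

def galaxies_in_slice(ra, dec, z, k):
    """2-D (RA, Dec) positions of the galaxies in slice k and inside the window."""
    keep = (slice_index(z) == k) & in_sky_window(ra, dec)
    return np.column_stack([np.asarray(ra)[keep], np.asarray(dec)[keep]])
```

The array returned by `galaxies_in_slice` is the 2-D point set on which density estimation, thresholding and ridge detection (steps 3 and 4) would then operate.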


4.2 Filament Maps 

Our filament map catalogue^1 contains a collection of points
on filaments. These points are obtained via SCMS with a
uniform grid as the initial points. Thus, one can view the
points in the filament maps as a uniformly random sample
on all filaments. Each filament point has eight variables as
listed in Table 1. The first two ($\alpha_{2000}$, $\delta_{2000}$) are the location
within that slice, and $z_{low}$ is the indicator (as well as the
lower bound of redshift) for the slice.

The density is the galaxy probability density within
each slice under the ($\alpha_{2000}$, $\delta_{2000}$)-coordinate at each filament
point (the KDE is used to estimate the density). Thus, the
total probability within each slice integrates to 1. The density
profile $F_f = F_f(z)$ is the distribution of density values at
the filament points within the same slice, and it evolves with
redshift. We compare the density profile within different redshift
regions in Figure 3. An advantage of using the density
under ($\alpha_{2000}$, $\delta_{2000}$)-coordinates is that we do not have to
renormalize the probability density because the size of each
slice remains approximately the same. If we used ordinary
Cartesian coordinates, the size of each slice would increase as
the redshift increases. As can be easily seen, at higher redshift,
galaxy densities at filament points tend to be lower.
The quantity H is a high-density indicator and is related to
the RMS density. If the density of a given filament point is
above the RMS density, H = 1; otherwise H = 0.

^1 The catalogue can be downloaded from https://sites.google.com/site/yenchicr/catalogue.
[Figure 2 panels: z = 0.105-0.110, z = 0.325-0.330, z = 0.470-0.475; horizontal axis: RA (deg)]

Figure 2. Examples of filament maps from the SDSS data. From top to bottom: z = 0.105-0.110 (NYU MGS), 0.325-0.330 (LOWZ)
and 0.470-0.475 (CMASS). The blue curves are detected filaments, the red dots are galaxy clusters from the redMaPPer catalogue, and
the orange dots are intersections of filaments (details can be found in Section 5.2).
Notation          Definition                              Comment

$\alpha_{2000}$   Right Ascension
$\delta_{2000}$   Declination
$z_{low}$         Redshift                                $z_{low} \le z < z_{low} + 0.005$
density           Galaxy density at each filament point
H                 High density indicator                  1: located at high density regions
UM                Uncertainty (Error)
$v_{ra}$          Direction of filament
$v_{dec}$         Direction of filament

Table 1. Definition of variates in the filament map file.


The quantity UM is the 1σ uncertainty (error) for detected
filaments. We measure the error for filaments by bootstrapping
the SDSS data 100 times. Further details may be
found in Chen et al. (2014a).

The last two entries of the table are the orientation of filaments
at each filament point. We use the density gradient
at each point on filaments as a proxy for the direction. This
proxy is known to be stable (Eberly 1996).


5 FILAMENTS AT DIFFERENT REDSHIFTS 

The filament maps at each redshift are used to construct
a summary file^2 that contains information about filaments
at different redshifts. This file consists of a 130 x 17 array.
Each row corresponds to a particular slice of the Universe
and each column provides information about that slice. We
describe all 17 variables in the file in Table 2.

The first variable ($z_{low}$) is the lower limit on redshift of
that slice. Each slice contains the region

$z_{low} \le z < z_{low} + 0.005$.

The second variable (N) is the number of galaxies within
this region. $N_{gc}$ is the number of galaxy clusters from
the redMaPPer catalogue (Rykoff et al. 2014; Rozo & Rykoff
2014) within the slice. We use only the clusters with spectroscopic
redshifts that are within the mask of each SDSS
catalogue. Figure 5 shows N and $N_{gc}$ at different redshifts.
The left panel displays the galaxy sample size from three
samples: the NYU main galaxy sample (black), the LOWZ
sample (green) and the CMASS sample (blue). The right
panel presents the number of clusters in each slice.

The number h is the smoothing bandwidth used in density
reconstruction and filament detection (Chen et al. 2015).
The left panel of Figure 6 shows the smoothing bandwidth
at different redshifts. We select h according to the reference
rule in the appendix of Chen et al. (2015), which depends on
the RMS of the density. The bandwidth h increases as the redshift increases
because, at high redshift, the number density of galaxies is
small so we need to enforce a strong degree of smoothing
to detect filaments. The trend of h is similar to the inverse
of the cube root of number density; see the right panel of
Figure 7.

The two variables $p_{mean}$ and $p_{rms}$ are the mean density
and the RMS density. The RMS measures the density
fluctuation of $p(x)$ and is used in the thresholding procedure
of the SCMS algorithm (Chen et al. 2015). We write
$p_{mean}(z) = p_{mean}$ and $p_{rms}(z) = p_{rms}$ since the mean density
and RMS density change as the redshift changes. The center
panel of Figure 6 displays $p_{mean}(z)$ and $p_{rms}(z)$. It is clear
that $p_{rms}(z)$ decreases as redshift increases while the mean
density $p_{mean}(z)$ remains roughly the same. These effects occur
for two reasons. First, density fluctuations are smaller
at early times (higher redshifts); second, the smoothing parameter
h is larger at higher redshifts so that the density
estimate is strongly smoothed, reducing the amplitude of
fluctuations.

^2 See https://sites.google.com/site/yenchicr/catalogue.

The quantity $F_{density}$ is the average density profile of filaments
in the given slice and is related to the result in Figure
3, which shows the distribution of density profile over wide
redshift ranges. The difference between $F_{density}$ and $p_{mean}$ is
that $F_{density}$ is the average density value on filaments only
while $p_{mean}$ is the average density value over the whole region
of observation. The right panel of Figure 6 presents the
over-density for filaments, which is defined as

$$\delta(z) = \frac{F_{density}(z) - p_{mean}(z)}{p_{rms}(z)}.$$

The over-density shows how well the filaments trace high density
regions. If $\delta(z)$ is large, then most filaments within this slice
trace high density regions. As can be seen, the over-density
for filaments decreases as redshift increases, implying that
filaments do not trace high density regions so well in the
high redshift range. There are several possible explanations
for this result. At higher redshift, the number density
is lower so that our filament reconstruction has larger errors.
Another possibility is that, at higher redshift, the smoothing
parameter h is also larger, which flattens the density
fluctuation.

The quantity dF is the average distance from all galaxies
to filaments within the specified slice. The related quantity
$dF_H$ is analogous to dF, but uses the distance to ‘high-density’
filaments, i.e., the distance to filament points whose
density is above the RMS density. The left panel of Figure
7 displays dF at each redshift. The average distance to filaments
increases as redshift increases. This increasing pattern
is caused by the change in number density: the higher redshift
regions generally have lower number density. To demonstrate
how number density affects the average distance, we
provide the inverse of the cube root of number density at
each redshift in the right panel of Figure 7; the patterns in the
left and right panels are clearly similar.
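
As an illustration of how dF and its high-density variant relate, the sketch below computes both for a toy configuration. It assumes flat 2-D coordinates and approximates the distance to a filament by the distance to the nearest catalogued filament point; the conversion from angular separation to Mpc used for the catalogue values is not shown, and all names are hypothetical.

```python
import numpy as np

def mean_distance_to_filaments(galaxies, filament_points):
    """Mean over galaxies of the distance to the nearest filament point."""
    gaps = np.linalg.norm(galaxies[:, None, :] - filament_points[None, :, :], axis=2)
    return gaps.min(axis=1).mean()

# Toy configuration: two galaxies, two filament points, one flagged H = 1.
galaxies = np.array([[0.0, 0.0], [0.0, 2.0]])
filament_points = np.array([[0.0, 1.0], [3.0, 0.0]])
high_density = np.array([0, 1])   # the H indicator of Table 1

dF = mean_distance_to_filaments(galaxies, filament_points)
dF_H = mean_distance_to_filaments(galaxies, filament_points[high_density == 1])
```

Because the high-density filament points are a subset of all filament points, `dF_H` can never be smaller than `dF`.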

The quantity $dF_{gc}$ is the mean distance to filaments
from galaxy clusters (redMaPPer clusters; Rykoff et al. 2014;
Figure 4. Examples of the uncertainty measures of the filament maps. From top to bottom: z = 0.105-0.110 (NYU MGS), 0.325-0.330
(LOWZ) and 0.470-0.475 (CMASS). We use color to visualize the amount of uncertainty for filament detection (red = high uncertainty).
Note that the color shows the relative uncertainty within each slice.


© 2015 RAS, MNRAS 000, 1-14 











8 Yen-Chi Chen et al. 



[Figure 5: two panels; horizontal axis: Redshift; right panel title: Galaxy Clusters]

Figure 5. Left: Number of galaxies within each slice. At the boundaries between two catalogues, the number of galaxies per slice is small. Right:
Number of galaxy clusters from the redMaPPer catalogue within each slice. The majority of redMaPPer clusters are in the region of the LOWZ
sample.



[Figure 6: three panels; horizontal axis: Redshift]

Figure 6. Left: Smoothing bandwidth over the redshift range 0.05-0.70. We must apply a larger smoothing bandwidth for data at
higher redshift since the number density decreases. Center: The mean and the RMS (probability) density as a function of redshift. In
general, the mean density does not vary much as the redshift changes. The RMS density, however, decreases when the redshift
increases. See §5 for discussion of possible causes of this pattern. Right: The over-density signal at different slices. The over-density signal
is the average density on all filament points within a slice minus $p_{mean}$, divided by $p_{rms}$. The over-density measures how well
filaments trace high density regions. The decreasing pattern might come from higher errors in detecting filaments at higher redshift or from a
side effect of smoothing. See §5 for more details.



Figure 7. Left: The distance to filaments from galaxies. The displayed errors for the left panel are multiplied by a factor of 20 to show
the minuscule error. Center: The distance to filaments from galaxy clusters. Comparing the left panel to the center panel, we see
clearly that clusters are closer to filaments than a randomly selected galaxy. Right: The inverse of the cube root of number density n(z). This
quantity has the unit of distance and is generally proportional to the average distance between galaxies. Distances to filaments from
both galaxies and clusters have a similar trend to $n(z)^{-1/3}$. This reveals that the increasing pattern in redshift is due to the change in
number density.
Notation       Definition                                       Unit          Remark

$z_{low}$      Redshift value                                                 $z_{low} \le z < z_{low} + 0.005$
N              Galaxy number
$N_{gc}$       Galaxy cluster number
$h_{deg}$      Smoothing bandwidth                              degree
$h_{Mpc}$      Smoothing bandwidth                              Mpc
$p_{mean}$     Mean galaxy density                              degree^-2
$p_{rms}$      RMS of galaxy density                            degree^-2
$F_{density}$  Mean galaxy density on filaments                 degree^-2
dF             Mean galaxy distance to filaments                Mpc
$dF_H$         Mean galaxy distance to high density filaments   Mpc
$dF_{gc}$      Mean cluster distance to filaments               Mpc           -1: $N_{gc} = 0$
$dF_{gc,H}$    Mean cluster distance to high density filaments  Mpc           -1: $N_{gc} = 0$
$UM_{Q1}$      First quartile for uncertainty of filaments      Mpc
$UM_{med}$     Median uncertainty of filaments                  Mpc
$UM_{avg}$     Mean uncertainty of filaments                    Mpc
$UM_{Q3}$      Third quartile for uncertainty of filaments      Mpc
$UM_{rms}$     Uncertainty fluctuation (RMS) of filaments       Mpc

Table 2. Definition of entries in the catalogue-description file.



[Figure 8: horizontal axis: Redshift]

Figure 8. Summary statistics showing how filament uncertainty
measures evolve with redshift. The increasing pattern along redshift
for all uncertainty measures is from the change in number
density (cf. right panel of Figure 7).


Rozo & Rykoff 2014); $dF_{gc,H}$ is similar to $dF_H$ but is evaluated
at each galaxy cluster, i.e., it is the mean
distance to the high-density filaments from galaxy clusters.
If $N_{gc} = 0$, both $dF_{gc}$ and $dF_{gc,H}$ are set to -1. The
center panel of Figure 7 shows the mean distance of galaxy
clusters $dF_{gc}$ at various redshifts. Basically, $dF_{gc}$ follows
a similar trend to dF but has a lower value, indicating that,
on average, clusters are closer to filaments than a randomly
selected galaxy.

Finally, the five quantities $UM_{Q1}$, $UM_{med}$, $UM_{avg}$, $UM_{Q3}$
and $UM_{rms}$ are summary statistics for the uncertainty distribution
on filaments within each slice. These quantities are
the first quartile (25%), median, mean, third quartile (75%)
and root mean square of all the uncertainty values on filaments.
The uncertainties are computed using the bootstrap
method of Chen et al. (2015). The summary at different
redshifts is presented in Figure 8. The increase of the uncertainty
as a function of redshift is due to the change in
number density (cf. right panel of Figure 7).


5.1 Filament Evolution 


The metric we adopt for quantifying the evolution of filaments
is the fraction of galaxies and clusters within filaments
at different redshifts. To account for the difference in number
density due to the redshift, we first derive a scaled distance
to the nearest filament for each galaxy (and cluster) using
the smoothing parameter and uncertainty measures. Let D
be the distance to the nearest filament from a galaxy, $\pi$ be the nearest
point on a filament and U be the uncertainty measure at
$\pi$ (the uncertainty measure is defined only for points on filaments).
The scaled distance (to the nearest filament) from
a specified galaxy is defined as

$$S = \sqrt{\frac{D^2 + U^2}{h^2}}, \qquad (3)$$

where h is the smoothing parameter. We divide the distance
by the smoothing parameter so that this scaled distance is comparable
from slice to slice (otherwise, for galaxies at lower
redshift, S would be much smaller than for galaxies at higher redshift).
A galaxy (or a cluster) is classified as within a filament
if

$$S \le 0.3246. \qquad (4)$$

The constant 0.3246 arises from the density of the Gaussian distribution.
Let $\phi(x)$ be the density of the Gaussian distribution. Then

$$\frac{\phi(0.3246)}{\phi(0)} \approx 0.9.$$

If we convolve a true filament with a Gaussian, the resulting
filamentary regions are those points with potential above
90%; i.e., galaxies or clusters within these regions are recognized
as being ‘within’ filaments.
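
The classification rule can be checked numerically with a short sketch. It reads equation (3) as $S = \sqrt{D^2 + U^2}/h$, which is an assumption about the garbled original, and the function names are hypothetical. The last two lines evaluate the Gaussian ratio under two possible conventions, since the text does not state which normalisation of the Gaussian is intended.

```python
import math

THRESHOLD = 0.3246   # constant from criterion (4)

def scaled_distance(D, U, h):
    """Scaled distance S of equation (3), read as sqrt(D^2 + U^2) / h."""
    return math.sqrt(D**2 + U**2) / h

def within_filament(D, U, h):
    """Criterion (4): a galaxy or cluster is 'within' a filament if S <= 0.3246."""
    return scaled_distance(D, U, h) <= THRESHOLD

# Ratio phi(0.3246) / phi(0) for two Gaussian profiles.
ratio_exp_minus_x2 = math.exp(-THRESHOLD**2)          # profile exp(-x^2)
ratio_standard_normal = math.exp(-THRESHOLD**2 / 2)   # profile exp(-x^2 / 2)
```

With the profile exp(-x^2) the ratio is approximately 0.90, matching the 90% level quoted in the text, whereas the standard normal profile exp(-x^2/2) gives approximately 0.95. This suggests, without confirming, that the former convention underlies the constant.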

Figure 9 displays the proportion of galaxies from different
catalogues as well as clusters that are ‘within’ filaments
using criteria (3) and (4). The three color bars (black, red
and blue) are the ‘mean’ proportion for NYU MGS, LOWZ
and CMASS galaxies. The brown line is the result for galaxies
from all the samples at different redshifts, and the purple
lines are the ratios for clusters. For galaxies, there is a clear
decrease with redshift, with a small bump at z ~ 0.5. This
region is the beginning of the CMASS sample, so the number
density is in fact increasing (see Figure 5) and therefore our
detection power is increasing. The width of the error bar in Figure
8 also drops at z = 0.5, indicating the same pattern.
This effect is stronger for galaxy clusters.

[Figure 9: horizontal axis: Redshift]

Figure 9. Fraction of galaxies and clusters within filaments at
different redshifts. The fraction of clusters within filaments (purple
curves) roughly follows the trend of number density at different
redshifts. In general, our result suggests that roughly 30%
of galaxies are in filaments.

Another useful statistic is the proportion of ‘stable’
filament points. We classify a filament point as stable if

$$ U_M \le \bar{U}_M + k\, U_{M,\mathrm{RMS}}, \qquad (5) $$

where $U_M$ is the uncertainty measure of the filament point,
and $\bar{U}_M$ and $U_{M,\mathrm{RMS}}$ are the mean and the
root mean square of the uncertainty over all filament points
across every slice. The number k is the threshold level for
defining a filament point as stable.
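
A minimal sketch of the stability criterion (5), assuming it reads
"uncertainty <= mean + k * RMS" with the mean and RMS pooled over all
slices, is given below. The function name and the dictionary layout
(slice label mapped to a list of uncertainty values) are hypothetical.

```python
import math


def stable_fraction_by_slice(slices, k):
    """Fraction of 'stable' filament points in each slice.

    slices: dict mapping a slice label (e.g. its redshift) to the list of
            uncertainty values of the filament points in that slice
    k:      threshold level in criterion (5)
    """
    pooled = [u for values in slices.values() for u in values]
    mean_u = sum(pooled) / len(pooled)
    # Root mean square of the pooled uncertainty values.
    rms_u = math.sqrt(sum(u * u for u in pooled) / len(pooled))
    cutoff = mean_u + k * rms_u
    return {
        label: sum(u <= cutoff for u in values) / len(values)
        for label, values in slices.items()
    }
```

Since the cutoff is shared by all slices, slices with larger typical
uncertainty yield a smaller stable fraction, and increasing k raises
the fraction in every slice.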

Figure 10 displays the proportion of stable filament points as
a function of redshift for k ranging from 0 to 2. For all
k, we see a clear pattern: the ratio first drops, then
increases, and then drops again. This pattern is more
pronounced at smaller k and is similar to that of the number
of observations in each slice (cf. Figure 5).


5.2 Filament Intersections 

As mentioned in the introduction, an attractive property of
SCMS filaments is their good agreement with known galaxy
clusters. Chen et al. (2015) demonstrated that most clusters
lie close to the detected filaments. In this section, we
identify intersections of filaments and compare them to the
locations of galaxy clusters.

To obtain filament intersections, we apply a simple algorithm
derived from metric graph reconstruction (Aanjaneya et al.
2012; Lecci et al. 2013), a method from computational
geometry, to the filaments detected by SCMS. The
implementation details can be found in Appendix A.

Figure 10. Ratio of ‘stable’ filament points at different redshifts
under different threshold levels. Changing the threshold for
stability reveals an interesting pattern for the ratio of stable
filament points, which follows a trend similar to that of the
number of galaxies within each slice. See Section 5.1 for further
details.

Figure 2 presents an example of applying this detection
algorithm to our filament maps. The orange points are the
detected intersections. The algorithm successfully
identifies the intersection points, and most galaxy clusters
are close to these points.

To quantify the closeness of clusters to intersection
points, we compute the distance from galaxy clusters to
the intersection points at different redshifts and compare
this distance statistic to the distance from a random galaxy
to the intersection points. Figure 11 shows the distribution
of distances from clusters (red) versus that from galaxies
(blue). We use the one-sided KS test to compare the two
distributions; the results are given in Table 3. The clusters
are significantly closer to filament intersections than
galaxies are. The worst case (largest p-value) is at
0.4 < z < 0.45. This region corresponds to the boundary
between the LOWZ and CMASS samples and is the region with
the smallest number density of galaxies (cf. Figure 5). Our
filament detection algorithm therefore lacks statistical
power in this region, so it is expected that the p-value is
largest here.
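
The comparison above can be illustrated with a small, self-contained
sketch of a one-sided two-sample KS test. It uses the statistic
D+ = max_x [F_a(x) - F_b(x)] together with the standard asymptotic
approximation p ≈ exp(-2 n m D+^2 / (n + m)); whether the p-values in
Table 3 were computed with this approximation or with an exact method
is not stated, so this is an assumption. The function name is
hypothetical. In practice a library routine such as
`scipy.stats.ks_2samp` (which accepts an `alternative` argument) would
normally be used.

```python
import math
from bisect import bisect_right


def ks_one_sided(sample_a, sample_b):
    """One-sided two-sample KS test.

    Tests whether sample_a tends to take smaller values than sample_b,
    i.e. whether its empirical CDF lies above that of sample_b.
    Returns (D_plus, asymptotic p-value).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d_plus = 0.0
    # The supremum of F_a - F_b is attained at one of the data points.
    for x in a + b:
        diff = bisect_right(a, x) / n - bisect_right(b, x) / m
        d_plus = max(d_plus, diff)
    effective_n = n * m / (n + m)
    p_value = math.exp(-2.0 * effective_n * d_plus ** 2)
    return d_plus, p_value
```

Here `sample_a` would hold the cluster-to-intersection distances and
`sample_b` the galaxy-to-intersection distances within one redshift
range; a small p-value indicates that clusters lie closer to
intersections than galaxies do.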


6 MAGNITUDE AND DISTANCE TO 
FILAMENTS 

We use the filament maps to investigate the relation between
a galaxy’s magnitude and its distance to filaments.
Specifically, we wish to determine whether galaxies near
filaments tend to be more luminous.

We separately analyze each of the three galaxy catalogues
(NYU MGS, LOWZ and CMASS). For each catalogue, we slice the
redshift range into bins of width Δz = 0.005, matching the
bins of our filament maps. We focus on

(Footnote: We also provide the intersections of filaments at
https://sites.google.com/site/yenchicr/catalogue.)
Figure 11. Distributions of distances from clusters (red) and galaxies (blue) to intersection points. Each panel is the result for a
particular redshift region. In every panel, we observe that clusters generally have shorter distances to intersection points than randomly
selected galaxies. To make this statement quantitative, we perform a KS test for each pair of distributions. The results are given in Table 3.




Figure 12. Top row: Selected magnitude regions for each sample (purple rectangles). Galaxies within the purple rectangles are selected
to obtain a volume-limited sample. There is a strong cut on magnitude with redshift due to the observational limit. We have reversed
the direction of the y-axis (magnitude) so that galaxies in the upper region are brighter. Bottom row: Absolute magnitude
(r-band) versus distance to filaments for the volume-limited samples. The orange line marks the boundary between the decreasing pattern and random
fluctuation; it is selected by fitting a piecewise linear regression to the data (selected values: 24.67 Mpc for NYU MGS, 51.67
Mpc for LOWZ and 57.34 Mpc for CMASS). To the left of the orange line there is a strong decreasing trend, while to the right
the patterns exhibit random fluctuations.
Redshift        p-value            Redshift        p-value
0.100-0.150     2.79 × 10^-11      0.300-0.350     2.68 × 10^-[?]
0.150-0.200     2.55 × 10^-[?]     0.350-0.400     2.16 × 10^-[?]
0.200-0.250     2.58 × 10^-[?]     0.400-0.450     1.28 × 10^-2
0.250-0.300     4.47 × 10^-[?]     0.450-0.500     4.95 × 10^-[?]


Table 3. Significances from a one-sided, two-sample KS test of
the null hypothesis that galaxy clusters lie at the same
average distance from intersections as field galaxies. The
p-value is a statistical quantity that measures the
significance; the usual rejection rule requires p < 0.05.


the region 0.1 < z < 0.6, since the redMaPPer cluster
catalogue mainly covers this redshift range. For each slice, we
remove galaxies whose distance to a galaxy cluster is less than
5 Mpc, thus eliminating the effect of galaxy clusters.

Since the SDSS dataset is not volume limited, we had to 
apply additional constraints to construct a volume-limited 
sample. Our selection rule is 

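$$
\begin{aligned}
\text{(NYU MGS)}\quad & M_r < -21.5, & 0.10 \le z \le 0.20, \\
\text{(LOWZ)}\quad & M_r < -22.5, & 0.20 \le z \le 0.45, \\
\text{(CMASS)}\quad & M_r < -21.5, & 0.45 \le z \le 0.60.
\end{aligned}
$$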
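
The selection rule can be written as a simple filter. The sketch below
assumes that the redshift bounds are inclusive and that the magnitude
cut is strict, as the selection rule is read here; the names `Cut`,
`CUTS` and `in_volume_limited_sample` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cut:
    max_abs_mag: float  # keep galaxies with M_r < max_abs_mag (brighter)
    z_min: float        # lower redshift bound (assumed inclusive)
    z_max: float        # upper redshift bound (assumed inclusive)


# Values taken from the selection rule above.
CUTS = {
    "NYU MGS": Cut(-21.5, 0.10, 0.20),
    "LOWZ": Cut(-22.5, 0.20, 0.45),
    "CMASS": Cut(-21.5, 0.45, 0.60),
}


def in_volume_limited_sample(sample, abs_mag_r, z):
    """Return True if a galaxy passes the cut for its parent sample."""
    cut = CUTS[sample]
    return abs_mag_r < cut.max_abs_mag and cut.z_min <= z <= cut.z_max
```

For example, a LOWZ galaxy with M_r = -23.0 at z = 0.30 is kept, while
one with M_r = -22.0 at the same redshift is dropped.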


The first row of Figure 12 shows the luminosity-redshift
region within each of the NYU MGS, LOWZ and CMASS
samples. This figure reveals the strong luminosity-redshift
dependence.

The bottom row of Figure 12 presents the relation between
magnitude and distance to filaments for each sample.
Every sample exhibits a strong dependence of magnitude
on distance (to filaments). Galaxies near filaments are
generally more luminous than those at greater distance from
filaments. This relation vanishes beyond a certain distance
(to the nearest filament). To determine where the trend
disappears, we fit the following piecewise linear function:


$$
M(x) =
\begin{cases}
\beta_0 + \beta_1 x   & \text{if } x < x_c, \\
\beta_0 + \beta_1 x_c & \text{if } x \ge x_c,
\end{cases}
\qquad (6)
$$


where M(x) is the magnitude and x is the distance to
filaments. Namely, M(x) is linear when x is less than
the critical distance x_c and is constant beyond the critical
distance. The optimal fit gives x_c = 24.67 Mpc for the NYU
MGS sample, 51.67 Mpc for the LOWZ sample and 57.34 Mpc for
the CMASS sample. This phenomenon can be explained by the
uncertainty of the filaments, which smooths out the impact
that the distance to filaments has on magnitude. From Figure
8, the uncertainties for filaments within the NYU MGS, LOWZ
and CMASS samples are 8, 15 and 20 Mpc, respectively. This is
why the effect spans longer distances at high redshifts.
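
One simple way to fit the piecewise linear model (6) is to note that
M(x) = β0 + β1 min(x, x_c), so that for a fixed x_c the problem reduces
to ordinary least squares on the clipped distance. The sketch below
scans a user-supplied grid of candidate critical distances and keeps
the one with the smallest residual sum of squares. The grid search is
an assumption made for illustration; the text does not specify how the
optimal x_c was found. All function names are hypothetical.

```python
def fit_line(t, y):
    """Ordinary least squares for y = b0 + b1 * t.

    Returns (b0, b1, sse), or None if t has zero variance.
    """
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    s_tt = sum((ti - mean_t) ** 2 for ti in t)
    if s_tt == 0:
        return None
    s_ty = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    b1 = s_ty / s_tt
    b0 = mean_y - b1 * mean_t
    sse = sum((yi - b0 - b1 * ti) ** 2 for ti, yi in zip(t, y))
    return b0, b1, sse


def fit_hinge(x, y, candidates):
    """Fit M(x) = b0 + b1 * min(x, x_c) over a grid of candidate x_c.

    Returns (b0, b1, x_c, sse) for the best candidate, or None if no
    candidate gives a valid fit.
    """
    best = None
    for x_c in candidates:
        clipped = [min(xi, x_c) for xi in x]
        fit = fit_line(clipped, y)
        if fit is None:
            continue
        b0, b1, sse = fit
        if best is None or sse < best[3]:
            best = (b0, b1, x_c, sse)
    return best
```

With x the distance to the nearest filament and y the absolute
magnitude, the returned x_c plays the role of the orange line in the
bottom row of Figure 12.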

The slope β1 in (6) determines the strength, as well
as the significance, of the decreasing pattern and is given
in Table 4. According to Table 4, there is significant
evidence (6.1σ to 12.3σ) that the luminosity is indeed
negatively correlated with the distance below the critical
distance.
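
The significances quoted in Table 4 appear to be consistent with the
ratio |estimate| / (standard error) of the slope. The short check below
illustrates this arithmetic. The exponents (10^-4 for the estimates,
10^-4 and 10^-5 for the standard errors) are read from Table 4 under
the assumption that they are as transcribed here, and the variable
names are hypothetical.

```python
# (slope estimate, standard error) per sample, as read from Table 4.
TABLE4_SLOPES = {
    "NYU MGS": (-11.82e-4, 1.92e-4),
    "LOWZ": (-4.34e-4, 6.53e-5),
    "CMASS": (-5.13e-4, 4.17e-5),
}


def significance(estimate, std_err):
    """Number of standard errors separating the estimate from zero."""
    return abs(estimate) / std_err


RATIOS = {name: significance(*vals) for name, vals in TABLE4_SLOPES.items()}
# RATIOS is approximately {"NYU MGS": 6.16, "LOWZ": 6.65, "CMASS": 12.30},
# within rounding of the quoted 6.15, 6.64 and 12.31 sigma.
```

The small differences in the last digit are what one would expect if
the quoted significances were computed from unrounded slope values.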


7 CONCLUSION 

In this paper, we construct a series of two-dimensional
filament maps from SDSS data using the SCMS algorithm. We
provide several statistics to measure the properties of the
filamentary maps we constructed at each redshift. These
measurements may be used to study the evolution of the
Universe and constrain cosmology.

We compare our publicly available catalogue to the existing
filament catalogues introduced in Sousbie et al. (2008),
Jasche et al. (2010), Smith et al. (2012), and Tempel
et al. (2014). Each of these catalogues provides some
analysis of the large-scale structure of the Universe
using a different model for filaments. However, none of
them is publicly available, which makes it difficult for
other research groups to use them to analyze filaments.
Moreover, unlike our catalogue, none of these catalogues
provides any measurement of the errors in filament
detection, and all of them focus on a small redshift range
(z < 0.25). To our knowledge, our filament catalogue is so
far the only filament catalogue for redshift z > 0.25 in the
SDSS.

We apply our filament maps to investigate the galaxy
luminosity-filament distance relation using a volume-limited
sample. There is a long-distance effect of filaments (more
than 20 Mpc) on the brightness of galaxies, which is at a
different scale from that of Guo et al. (2015), who found a
similar pattern at a much smaller scale (distances less than
0.71 Mpc). Although part of the long-distance effect can be
explained by the errors of the filaments, our results suggest
that the correlation between galaxy magnitude and distance to
filaments may extend over distances much greater than 1 Mpc.
                         NYU MGS            LOWZ               CMASS
                         (z = 0.10 - 0.20)  (z = 0.20 - 0.43)  (z = 0.43 - 0.70)
Slope† estimate          -11.82 × 10^-4     -4.34 × 10^-4      -5.13 × 10^-4
Slope† standard error    1.92 × 10^-4       6.53 × 10^-5       4.17 × 10^-5
Significance             6.15σ              6.64σ              12.31σ


Table 4. Linear fit for the three catalogues for absolute magnitude versus distance to filaments. † A negative slope indicates that the
luminosity decreases as the distance (to filaments) increases.


Vanderbilt University, University of Virginia, University of 
Washington, and Yale University. 


APPENDIX A: ALGORITHM FOR 
DETECTING INTERSECTION POINTS 

In this section, we describe the algorithm, based on metric
graph reconstruction (Aanjaneya et al. 2012; Lecci et al.
2013), for detecting intersection points of filaments. Our
algorithm examines every point on the filaments and assigns
it to the ‘intersection’ class or the ‘non-intersection’
class using the following process. Let x be a point we wish
to examine.

1. Keep those data points whose distance to x is between r_in
and r_out, two parameters.

2. Cluster the remaining points using hierarchical clustering
with radius r_sep, i.e., partition the points into several
groups such that the group-group distance is greater than
r_sep.

3. Count the number of groups from the previous step. If the
number of groups is greater than or equal to three, classify
x as an intersection point; otherwise classify it as a
non-intersection point.

The idea behind this algorithm is that when a point is at an
intersection, the other points around it within the shell
(formed by r_in and r_out) should form at least three
clusters. For a point along an edge, there will be two
clusters, and for an end point, there is only one cluster.
Points near the same intersection may all be classified as
intersection points; we use their mean location as the
intersection point. The parameters are chosen as follows:

rin. = 2/i/3, rout — rsep — (rtn -t- rout)D. 

This choice of parameters is ad hoc but works well in
practice.
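
The three steps above can be sketched as follows. The grouping in step
2 is implemented here as single-linkage clustering (two points belong
to the same group if they are connected by a chain of points with
consecutive separations of at most r_sep), which is one reading of
"group-group distance greater than r_sep" and should be regarded as an
assumption. The function names are hypothetical, and the parameter
values must be supplied by the caller.

```python
import math


def count_shell_groups(x, points, r_in, r_out, r_sep):
    """Count the groups of points in the shell r_in <= |p - x| <= r_out.

    x:      the point under examination, as a tuple of coordinates
    points: all filament points, as tuples of coordinates
    """
    # Step 1: keep the points inside the shell around x.
    shell = [p for p in points if r_in <= math.dist(p, x) <= r_out]

    # Step 2: single-linkage grouping via flood fill.
    unvisited = set(range(len(shell)))
    groups = 0
    while unvisited:
        groups += 1
        stack = [unvisited.pop()]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if math.dist(shell[i], shell[j]) <= r_sep]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
    return groups


def is_intersection(x, points, r_in, r_out, r_sep):
    """Step 3: x is an intersection if the shell holds at least three groups."""
    return count_shell_groups(x, points, r_in, r_out, r_sep) >= 3
```

On two straight filaments crossing at a point, the shell around the
crossing contains four groups (one per arm), a point along an arm sees
two groups, and an end point sees one, in line with the reasoning
given above.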
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